Scale-free networks with an exponent less than two 
We study scale-free simple graphs whose degree-distribution exponent $\gamma$ is less than two. Generically one expects such extremely skewed networks, which occur very frequently in systems of virtually or logically connected units, to have properties different from those of scale-free networks with $\gamma > 2$: the number of links grows faster than the number of nodes, and they naturally possess the small-world property, because the diameter increases as the logarithm of the network size while the clustering coefficient remains finite. We discuss a simple prototype model of such networks, inspired by real-world phenomena, which exhibits these properties and allows for a detailed analytical investigation.

There has been a recent surge of interest in the network structures that underlie many real-world phenomena. This is partly because a network's topology plays a key role in understanding these phenomena, and partly because of the ubiquity of a few generic features such as the small-world property and a scale-free distribution of degrees. The latter has been observed, for example, in the World Wide Web, citation networks, protein interaction networks, film actor networks and electronic circuits. Indeed, in each of these systems nodes (web pages or actors) are linked (by hyperlinks or by collaboration in the same movie) to a number $k$ of other nodes, which is called the degree of the node and which obeys a power-law distribution $P(k) \sim k^{-\gamma}$. In many cases (table I) the exponent $\gamma$ of such a distribution is larger than two, and its occurrence has been related to some interaction mechanism, such as preferential attachment, in simplified models.

Scale-free networks with an exponent $\gamma < 2$ have received less attention, despite their widespread appearance (table I): in the peer-to-peer Gnutella network, in networks of outgoing e-mail traffic, in the co-authorship network in high energy physics, and in the network of dependencies among software packages.

The aim of this letter is to show that simple graphs with $\gamma < 2$ have markedly different properties from simple graphs with $\gamma > 2$. We shall do this first on the basis of general arguments and then using a prototype model motivated by the above-mentioned real networks. This model reproduces all the generic properties discussed. Furthermore, we show that its generalization to a weighted network exhibits non-trivial statistical properties.

Generic properties - We focus on simple graphs with uncorrelated degree distribution. In the ensemble of Ref. [15], where the probability of a link between nodes $i$ and $j$ is $p_{ij} = 1 - e^{-k_i k_j/(n\langle k\rangle)}$, with $\langle k\rangle = \sum_i k_i/n$ the average degree, nodes with degrees $k_i \sim \sqrt{n\langle k\rangle}$ cannot be considered as independent. The degrees of a simple scale-free graph are uncorrelated only if a structural cutoff $k_c(n) \sim \sqrt{n\langle k\rangle}$ is imposed on the degree distribution.



Random uncorrelated networks with $\gamma < 2$ differ fundamentally in their topology from networks with $\gamma > 2$. Indeed, $\gamma < 2$ implies that the average degree increases with the system size, $\langle k\rangle \sim n^{\xi}$, which means that the total number of links grows faster than the number of nodes. This in turn means that the cutoff $k_c(n)$ diverges with the system size in a non-trivial manner. When $\gamma > 2$ the mean degree $\langle k\rangle$ is finite and hence $k_c(n) \sim n^{1/2}$. By contrast, for $\gamma < 2$ the divergence $\langle k\rangle \sim n^{\xi}$ implies that the structural cutoff scales with system size $n$ as $k_c(n) \sim n^{(1+\xi)/2}$. This, together with the explicit calculation of $\langle k\rangle \sim k_c^{2-\gamma}$, leads to

$$\xi = (2-\gamma)/\gamma. \tag{1}$$

Correlated networks with a cutoff $k_c(n) \sim n^{\chi}$ which diverges faster with $n$ will exhibit an even faster divergence of $\langle k\rangle$, with $\xi = \chi(2-\gamma)$.
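
As a sketch of the step behind Eq. (1), assume a pure power law $P(k)\sim k^{-\gamma}$ truncated at the structural cutoff, with $1<\gamma<2$ (the lower bound is an assumption made here so that the distribution is normalizable). The mean degree is then dominated by the cutoff, and the two scaling relations close on themselves:

```latex
\begin{align}
\langle k\rangle &\sim \int^{k_c} k\,k^{-\gamma}\,dk \sim k_c^{\,2-\gamma},
  \qquad 1<\gamma<2, \\
k_c \sim n^{(1+\xi)/2}
  &\;\Longrightarrow\; n^{\xi} \sim n^{(1+\xi)(2-\gamma)/2}, \\
2\xi = (1+\xi)(2-\gamma)
  &\;\Longrightarrow\; \xi = \frac{2-\gamma}{\gamma}.
\end{align}
```

For $\gamma = 3/2$ this gives $\xi = 1/3$ and a cutoff exponent $(1+\xi)/2 = 2/3$.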

Uncorrelated networks with such a broad distribution of degrees are expected to have a high clustering coefficient. The clustering coefficient is the ratio of the number of loops of size 3 [16, 17] to the number of triples of connected nodes, which is $\sum_i k_i(k_i - 1)$. So, using Eq. (1) and the fact that $\langle k^2\rangle \sim k_c^{3-\gamma}$, we find a finite clustering coefficient:

$$C \sim \frac{\langle k^2\rangle^2}{\langle k\rangle^3 n} \sim \mathrm{const}. \tag{2}$$

By contrast, the same argument implies a vanishing clustering coefficient $C \sim n^{2-\gamma}$ for $\gamma > 2$.
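
The power counting of this section can be checked mechanically. The following is a minimal sketch, not part of the original analysis: the function name `scaling_exponents` is hypothetical, and it assumes $1 < \gamma < 3$ with $\gamma \neq 2$, so that $\langle k^2\rangle$ is dominated by the cutoff. It returns the exponents of $n$ in $\langle k\rangle$, in $k_c$ and in $C$.

```python
from fractions import Fraction


def scaling_exponents(gamma):
    """Exponents of n for <k>, k_c and C in an uncorrelated simple graph.

    Assumes a power-law degree distribution with 1 < gamma < 3, gamma != 2,
    truncated at the structural cutoff k_c ~ sqrt(n <k>).
    """
    gamma = Fraction(gamma)
    if not (1 < gamma < 3) or gamma == 2:
        raise ValueError("this sketch only covers 1 < gamma < 3, gamma != 2")
    # Eq. (1) for gamma < 2; the mean degree is finite for gamma > 2.
    xi = (2 - gamma) / gamma if gamma < 2 else Fraction(0)
    cutoff = (1 + xi) / 2                 # k_c ~ n^((1 + xi) / 2)
    second_moment = cutoff * (3 - gamma)  # <k^2> ~ k_c^(3 - gamma)
    # Eq. (2): C ~ <k^2>^2 / (<k>^3 n)
    clustering = 2 * second_moment - 3 * xi - 1
    return xi, cutoff, clustering


if __name__ == "__main__":
    for g in ("3/2", "5/2"):
        print(g, scaling_exponents(g))
```

For $\gamma = 3/2$ the clustering exponent is zero (finite $C$), while for $\gamma = 5/2$ it equals $2 - \gamma = -1/2$, as stated in the text.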

Such a high clustering is consistent with the presence of a high-density core: indeed, a finite fraction of nodes are within a distance $\log\log n$ of one another [18]. Still, the diameter of the network is of order $\log n$. Indeed, there is a finite number of nodes with degree $k_i = 1$ and $2$, and these form chains which connect to the core and whose length is exponentially distributed. Hence the longest chain has length $\ell_{\max} \sim \log n$, and it dominates the behavior of the diameter. Similar arguments were also used in Ref. [18] for graphs with $\gamma > 2$.

The model - Here we study in detail a prototype model of networks with $\gamma < 2$ motivated by the real systems discussed above. Our model is based on the idea of aggregation [19], and it is very similar to one recently and independently introduced in Refs. [20, 21] in a different context, and analyzed in part by Alava and Dorogovtsev [22]. We show that its statistical properties can be fully understood analytically and that they successfully reproduce the properties observed in real-world networks with $\gamma < 2$. Furthermore, the model shows that, in networks with $\gamma < 2$, the statistics of strength in weighted networks can be highly non-trivial and very different from its counterpart in networks with $\gamma > 2$. Therefore, we hope the model may serve as a starting point to understand more complex cases as well as to address different issues, such as the efficiency of search algorithms [23], routing, traffic flow and transmission of infections on peer-to-peer networks.

We consider a network of n nodes and, in each time 
step, we perform the following two steps: 

1. Creation: We create a new node and connect it to 
a randomly chosen node. 

2. Merging: We merge two randomly selected nodes.
If the two nodes were already connected, the corresponding link is removed. Likewise, multiple links to common neighbors of the two nodes are reduced to a single link.
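
The two moves above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used for the results reported here: the graph is stored as a dictionary of neighbor sets, links are treated as undirected, and the names `create`, `merge` and `step` are hypothetical.

```python
import random


def create(adj, new_id, rng):
    """Creation: attach node `new_id` to one uniformly chosen existing node."""
    target = rng.choice(list(adj))
    adj[new_id] = {target}
    adj[target].add(new_id)


def merge(adj, i, j):
    """Merging: node i absorbs node j; the graph stays simple."""
    for l in adj.pop(j):
        adj[l].discard(j)      # also drops the i-j link when l == i
        if l != i:
            adj[i].add(l)      # sets collapse double links to common neighbors
            adj[l].add(i)


def step(adj, new_id, rng):
    """One time step: a creation followed by a merge of two random nodes."""
    create(adj, new_id, rng)
    i, j = rng.sample(list(adj), 2)
    merge(adj, i, j)
    return new_id + 1


# Usage: start from a ring of 100 nodes (an arbitrary initial condition).
rng = random.Random(1)
adj = {a: {(a - 1) % 100, (a + 1) % 100} for a in range(100)}
new_id = 100
for _ in range(2000):
    new_id = step(adj, new_id, rng)
mean_degree = sum(len(nb) for nb in adj.values()) / len(adj)
```

Each step adds one node and removes one, so the number of nodes stays fixed, which is the stationary setting described in the text.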

The first move is like creating a new software package or e-mail address, or running a new instance of Gnutella. The second move can be related to merging two packages or abandoning one in favor of another, merging two e-mail accounts, or shutting down a Gnutella client-server and giving its load to another one.

The model describes a stationary network with a fixed number of nodes. If the second process is run at a smaller rate than the first, the model describes a growing network (see Ref. [22], where a similar extension has been analyzed). To perform our simulations, we started from a graph with a couple of nodes and let it grow, by allowing more creation than merging, until it reached a given size. After that we merged and created nodes alternately to keep the number of nodes fixed, and we continued until the average degree reached its stationary value. At that point we started taking snapshots of the network at a fixed interval, long enough to give us thousands of independent structures. The interval between samples was about the same as the time we had waited to reach the stationary state. We repeated the process for different network sizes. Results are reported in figure 1, and table I compares the characteristics of our networks to those of real-world cases.
The degree distribution $P(k)$ follows a scaling form $P(k) = k^{-\gamma} f(k/n^{\sigma})$ with $\gamma \simeq 1.5$ and $\sigma \simeq 0.67$, where $n$ indicates the total number of nodes and $k$ the degree of the nodes. Here $f(x)$ is a scaling function with $f(x) \simeq \mathrm{const}$ when $x \ll 1$ and with $f(x)$ decaying faster than any power of $x$ for $x \gg 1$. In our model, since the exponent is $\gamma < 2$, the total number of links $m$ also follows a power law of the form $m = n^{\xi+1}$ with exponent $\xi \simeq 0.33 > 0$, at odds with most studied models with $\gamma > 2$, for which $\xi = 0$. The exponents found above agree perfectly with the exponent relation $\sigma = (1+\xi)/2$ and with Eq. (1). Moreover, we found that the networks produced by the above dynamics have the small-world property: their diameter grows as $\log n$ with system size, whereas the clustering coefficient does not decrease as $n$ increases, in agreement with Eq. (2).

FIG. 1: Collapse plot of the degree distribution for networks of different sizes. The dashed line corresponds to a power law with exponent $-3/2$.

FIG. 2: The main figure shows the average weight of links connected to nodes with degree less than a given value $k_0$; the inset shows the histogram of node strengths. The legends show the number of nodes in each case.

Weighted network - It is also interesting to consider 
a model of weighted networks with the above dynamics. 
The idea, for example, is that if the link between two soft- 
ware packages i and j means that package i calls package 
j, it might also be interesting to keep track of how many 
times i calls j. Hence we associate a weight to each link 
ij and assume that it evolves according to the following 
rules: 

• A fresh link that connects a new node to the network has weight one.

• When merging two nodes $i$ and $j$ which are both linked to the same node $k$, as before we only keep one link, and its weight is the sum of the weights of the previous links.
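
The weight rules above can be sketched by replacing neighbor sets with neighbor-to-weight dictionaries. This is a minimal illustration under assumptions that go beyond the text: links are treated as undirected (so `w[a][b] == w[b][a]`), and the names `merge_weighted`, `weighted_step` and `strength` are hypothetical.

```python
import random


def merge_weighted(w, i, j):
    """Node i absorbs node j; weights of links to common neighbors add up."""
    for l, weight in w.pop(j).items():
        del w[l][j]                    # also removes the i-j link and its weight
        if l != i:
            total = w[i].get(l, 0) + weight
            w[i][l] = total
            w[l][i] = total


def weighted_step(w, new_id, rng):
    """Creation with a fresh link of weight one, then a random merge."""
    target = rng.choice(list(w))
    w[new_id] = {target: 1}
    w[target][new_id] = 1
    i, j = rng.sample(list(w), 2)
    merge_weighted(w, i, j)
    return new_id + 1


def strength(w, node):
    """Strength of a node: the sum of the weights of its links."""
    return sum(w[node].values())


# Usage: a weighted triangle in which nodes 0 and 1 share the neighbor 2.
w = {0: {1: 5, 2: 2}, 1: {0: 5, 2: 3}, 2: {0: 2, 1: 3}}
merge_weighted(w, 0, 1)
# The link 0-2 now carries weight 2 + 3; the weight 5 of the old 0-1 link is gone.
```

Keeping the merge separate from the random choice of the pair makes the weight bookkeeping easy to check by hand on small graphs.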

In the previous example, when two software packages are merged, the new package inherits all the calls that the merging packages made to a third piece of software. Likewise, when two e-mail accounts are merged, we assume that the traffic of e-mails to a third account will be the sum of the traffic originating from the two accounts before the merge. This neglects the presence of complementarities, which can be an important issue in some cases, but it is the most natural way to introduce weights in the model. Weights allow us to define the strength of a node in the usual way [24], i.e. as the sum of the weights of outgoing links.

The sum of all the weights increases when we add a node, and it decreases when we merge two nodes that are connected; therefore, one can expect it to reach a steady state. This was confirmed by simulation, which also shows that the distribution of the strengths decays as a power law with exponent 1.5. This would be consistent with a linear relation between strength and degree, but Fig. 2 shows that such a relation only holds for small $k$ and that most of the weight concentrates on high-degree nodes.

Analytic approach - It is possible to shed light on these findings and to calculate the exact value of the exponents for this model, following arguments similar to those of Ref. [23]. We can combine the two operations above into a single one, in which we replace two nodes $i$ and $j$ by two nodes of which $i$ inherits all the links (incoming and outgoing) of both nodes, while $j$ loses all its links and acquires a new link to a randomly chosen node. If $k'_i$ and $k'_j$ are the degrees of the two nodes after the process, we have

$$k'_i = k_i + k_j - m_{ij} - a_{ij}, \qquad k'_j = 1, \tag{3}$$



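where $a_{ij} = 1$ if the link $ij$ exists (and $a_{ij} = 0$ otherwise) and $m_{ij} = \sum_l a_{il} a_{jl}$ is the number of sites that were linked to both $i$ and $j$. Given that $i$ and $j$ are chosen at random, $m_{ij}$ and $a_{ij}$ can be regarded as random variables. The probability that the link $ij$ exists is $\langle a_{ij}\rangle = k_i k_j/(n\langle k\rangle)$; likewise, the average number of nodes connected to both $i$ and $j$ is

$$\langle m_{ij}\rangle = \sum_l \frac{k_i k_l}{n\langle k\rangle}\,\frac{k_l k_j}{n\langle k\rangle} = \mu k_i k_j, \tag{4}$$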



where $\mu = \langle k^2\rangle/(n\langle k\rangle^2)$. Let us now introduce the generating function for the degree distribution,

$$\Phi(z) = \frac{1}{n}\sum_{i=1}^{n} z^{k_i}.$$

In the stationary state, we can use Eq. (3) to derive the relation

$$\Phi(z) = \frac{1}{2}\,\mathrm{E}\!\left[z^{k'_i} + z^{k'_j}\right] = \frac{1}{2}\left[z + \mathrm{E}\!\left[z^{k_i+k_j}\, e^{\mu k_i k_j h(z)}\left(1 + \eta k_i k_j h(z)\right)\right]\right],$$

where $\eta = 1/(n\langle k\rangle)$, $h(z) = (1-z)/z$, and the last equality hinges upon the observation that $m_{ij}$ is a Poisson variable with mean given by Eq. (4) and that $a_{ij}$ is a random bit with $\langle a_{ij}\rangle = \eta k_i k_j$. Now we observe that both $\mu$ and $\eta \to 0$ as $n \to \infty$; consequently $\Phi(z)$ can be expanded in a power series in $\mu$ and $\eta$. The leading term ($\mu = \eta = 0$) yields $2\Phi(z) = \Phi^2(z) + z$, i.e.



$$\Phi(z) = 1 - \sqrt{1-z} = \frac{1}{2\Gamma(1/2)}\sum_{k=1}^{\infty}\frac{\Gamma(k-1/2)}{k!}\, z^k. \tag{5}$$

Therefore, for $n \to \infty$, we find $P(k) = \frac{\Gamma(k-1/2)}{2\Gamma(1/2)\,k!} \sim k^{-\gamma}$ with $\gamma = 3/2$. The exponent relations derived earlier can then be used to conclude that $\sigma = 2/3$ and $\xi = 1/3$. This conclusion is also supported by a direct calculation of the next terms in the small-$\mu$ expansion. These finite-$n$ corrections introduce a finite cutoff $k_c \sim n^{\sigma}$ in the distribution, but lead to cumbersome formulas which we will not detail here. A further way to compute $\sigma$ comes from observing that the average of Eq. (3) in the stationary state yields $1 = \langle k^2\rangle/n + \langle k\rangle/n$, i.e. $\langle k^2\rangle \simeq n$. This, combined with the relation $\langle k^q\rangle \sim n^{\sigma(q+1-\gamma)}$, implies $\sigma = 2/3$. This shows that the exponent relations $\sigma = (1+\xi)/2$ and Eq. (1), which are valid for random graphs, can be explicitly verified in this model.
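
The leading-order result can be checked numerically. The sketch below is an illustration only, with hypothetical function names: it evaluates the coefficients $\Gamma(k-1/2)/(2\Gamma(1/2)\,k!)$, verifies order by order that they satisfy $2\Phi(z) = \Phi^2(z) + z$, and compares their tail with $k^{-3/2}$.

```python
from math import exp, lgamma, pi, sqrt


def p_leading(k):
    """Leading-order P(k) = Gamma(k - 1/2) / (2 Gamma(1/2) k!), for k >= 1."""
    # lgamma avoids overflow at large k; Gamma(1/2) = sqrt(pi).
    return exp(lgamma(k - 0.5) - lgamma(k + 1)) / (2.0 * sqrt(pi))


def max_residual(kmax):
    """Largest violation of 2 Phi = Phi^2 + z among the orders z^1 .. z^kmax."""
    c = [0.0] + [p_leading(k) for k in range(1, kmax + 1)]
    worst = abs(2.0 * c[1] - 1.0)              # order z^1: 2 c_1 = 1
    for k in range(2, kmax + 1):
        conv = sum(c[a] * c[k - a] for a in range(1, k))
        worst = max(worst, abs(2.0 * c[k] - conv))
    return worst


if __name__ == "__main__":
    print("max residual:", max_residual(60))
    k = 10**6
    print("P(k) k^(3/2):", p_leading(k) * k**1.5, "vs", 1 / (2 * sqrt(pi)))
```

The product $P(k)\,k^{3/2}$ approaches $1/(2\sqrt{\pi})$ at large $k$, which is the statement $\gamma = 3/2$.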

A simple argument also allows us to understand the statistics of weights. Indeed, at each time step a new link with weight $w = 1$ is added. At the same time, the weight of the link between nodes $i$ and $j$, if present, is removed. In the stationary state, we then expect that the probability $\langle k\rangle/n$ that an existing link is chosen, times its average weight $\langle w\rangle$, must be equal to one. Hence $\langle w\rangle \sim n/\langle k\rangle \sim n^{2/3}$. Unlike for the degree distribution, we do not expect a cutoff in the distribution of weights [30]. Assuming that $P(w) \sim w^{-\eta-1}$ with $\eta < 1$ (here $\eta$ denotes the weight exponent, not the quantity $1/(n\langle k\rangle)$ used above), we know that $\langle w\rangle \sim (n\langle k\rangle)^{1/\eta - 1}$. Combining this with $\langle w\rangle \sim n^{2/3}$ we find that $\eta = 2/3$, in perfect agreement with numerical simulations. Concerning the node's strength $s_i$, in order to explain the behavior of Fig. 2 it is crucial to observe that nodes with $k_i \ll k_c$ will have links with weights of order one. Indeed, merge events in which nodes $i$ and $j$ share some of their neighbors are rare if $\langle m_{ij}\rangle = (k_i/\langle k\rangle)(k_j/\langle k\rangle) \ll 1$, i.e. if $k_i \ll k_c$ or if $k_j \ll k_c$. We therefore expect that links belonging to nodes with $k_i \ll k_c$ have weights of order one, i.e. that $s_i \sim k_i$. For $k_i \sim k_c$, instead, the additive process of weights on links to shared neighbors becomes relevant, eventually leading to a very broad distribution of weights on such nodes. This is a rather non-standard situation compared to that of most weighted networks with $\gamma > 2$ [24].
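
The weight exponent quoted in this paragraph follows from a short exponent match. As a worked check using only the scalings stated in the text ($\langle k\rangle \sim n^{1/3}$, so that the number of links is $n\langle k\rangle \sim n^{4/3}$):

```latex
\begin{align}
\langle w\rangle &\sim (n\langle k\rangle)^{1/\eta - 1}
   \sim n^{\frac{4}{3}\left(\frac{1}{\eta} - 1\right)}, \\
\langle w\rangle &\sim \frac{n}{\langle k\rangle} \sim n^{2/3}
   \;\Longrightarrow\; \frac{4}{3}\left(\frac{1}{\eta} - 1\right) = \frac{2}{3}
   \;\Longrightarrow\; \eta = \frac{2}{3}.
\end{align}
```

The first line is the usual estimate for the mean of a cutoff-free power law with $\eta < 1$, in which the mean is set by the largest of the $n\langle k\rangle$ sampled weights.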

Conclusions - We have discussed the properties of complex scale-free networks with degree distribution exponent $\gamma < 2$, which characterizes many real systems. We have shown that these properties are reproduced by a simple prototype model motivated by such real systems. A key characteristic of this class of networks is that their average degree grows with the system size, which suggests that making a link is inexpensive. This is indeed the case for networks of software packages: it is costly to make a package, but it costs nothing to use an already existing one. Interestingly, a peculiarity of the model is that it involves global moves. This requires some sort of global information exchange mechanism, not part of the network itself, that allows nodes to interact globally. In the example of software packages, this information exchange happens among programmers: they are responsible for the evolution of the system, and they do not exchange information only through the system. While both properties are likely to hold for open source packages, they might not apply to commercial software, which might be expensive to link to. A further problem is that statistical information on commercial software dependencies is not available. These two features also characterize other networks: for example, in Gnutella each node is a computer, but each link is only a logical connection between two computers and does not require any additional hardware. In the Gnutella network there are also web caches that store information about nodes and share it with other nodes, but these caches are not considered part of the network itself. It is tempting to conjecture that the relation between these two properties and networks with exponent $\gamma < 2$ is generic. This, applied to the co-authorship network, suggests that global interaction and information diffusion play an essential role in establishing a dense collaboration network.

TABLE I: Results of our simulation and comparison with some empirical observations. For directed networks the exponents are shown in the form in/out. Here $n$, $m$, $\gamma$, $C$ and $\ell$ denote the total number of nodes, the number of links, the exponent, the clustering coefficient and the mean shortest path length, respectively.